IEEE 802.15.4.e TSCH-Based Scheduling for Throughput Optimization: A Combinatorial Multi-Armed Bandit Approach
Authors
Abstract
Similar Resources
Material for "Combinatorial Multi-Armed Bandit ..."
We use the following two well-known bounds in our proofs. Lemma 1 (Chernoff-Hoeffding bound). Let X1, ..., Xn be random variables with common support [0, 1] and E[Xi] = μ. Let Sn = X1 + ... + Xn. Then for all t ≥ 0, Pr[Sn ≥ nμ + t] ≤ e^(−2t²/n) and Pr[Sn ≤ nμ − t] ≤ e^(−2t²/n). Lemma 2 (Bernstein inequality). Let X1, ..., Xn be independent zero-mean random variables. If for all 1 ≤ i ≤ n, |Xi| ≤ k...
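A minimal numerical sketch (illustrative only; the Bernoulli samples and parameter values are assumptions, not taken from the supplementary material) comparing an empirical tail probability of Sn against the Hoeffding bound e^(−2t²/n):

    import math
    import random

    def hoeffding_check(n=100, mu=0.5, t=10.0, trials=20000, seed=0):
        """Empirically compare Pr[S_n >= n*mu + t] with the bound exp(-2*t^2/n)."""
        rng = random.Random(seed)
        exceed = 0
        for _ in range(trials):
            s = sum(1 if rng.random() < mu else 0 for _ in range(n))
            if s >= n * mu + t:
                exceed += 1
        empirical = exceed / trials
        bound = math.exp(-2 * t * t / n)
        return empirical, bound

    if __name__ == "__main__":
        emp, bnd = hoeffding_check()
        print(f"empirical tail = {emp:.4f}, Hoeffding bound = {bnd:.4f}")

With these assumed parameters the empirical tail probability stays below the bound, as the lemma guarantees.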
Combinatorial Multi-Objective Multi-Armed Bandit Problem
In this paper, we introduce the COmbinatorial Multi-Objective Multi-Armed Bandit (COMOMAB) problem that captures the challenges of combinatorial and multi-objective online learning simultaneously. In this setting, the goal of the learner is to choose an action at each time, whose reward vector is a linear combination of the reward vectors of the arms in the action, to learn the set of super Par...
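As a hypothetical illustration of the combinatorial multi-objective setting described above (the arm count, mean reward vectors, and fixed-size subset structure are assumptions, not taken from the paper), the following sketch enumerates super-arms, sums the mean reward vectors of their arms, and keeps only the Pareto-optimal super-arms:

    from itertools import combinations

    def dominates(u, v):
        """u Pareto-dominates v: u >= v component-wise and strictly better somewhere."""
        return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))

    def pareto_optimal_super_arms(arm_means, k):
        """Enumerate k-subsets of arms; a super-arm's reward vector is the sum of its arms' mean vectors."""
        super_arms = {
            subset: tuple(sum(arm_means[i][d] for i in subset) for d in range(len(arm_means[0])))
            for subset in combinations(range(len(arm_means)), k)
        }
        return {
            s: r for s, r in super_arms.items()
            if not any(dominates(r2, r) for r2 in super_arms.values())
        }

    # Hypothetical mean reward vectors (two objectives) for four base arms.
    arm_means = [(0.8, 0.1), (0.2, 0.9), (0.5, 0.5), (0.4, 0.4)]
    for subset, reward in pareto_optimal_super_arms(arm_means, k=2).items():
        print(subset, reward)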
Multi-Armed Bandit for Pricing
This paper studies Multi-Armed Bandit (MAB) approaches for pricing applications, where a seller needs to identify the selling price for a particular kind of item that maximizes her/his profit without knowing the buyer demand. We propose modifications to the popular Upper Confidence Bound (UCB) bandit algorithm that exploit two peculiarities of pricing applications: 1) as the selling...
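A minimal sketch of a plain UCB1 loop over a discrete price grid with simulated Bernoulli demand (the price grid, demand probabilities, and reward scaling are assumptions for illustration; the paper's actual modifications to UCB are not reproduced here):

    import math
    import random

    def ucb_pricing(prices, buy_prob, horizon=5000, seed=1):
        """UCB1 over a price grid; per-round reward is price * 1{sale}, so the
        exploration bonus is scaled by the maximum price (rewards lie in [0, p_max])."""
        rng = random.Random(seed)
        p_max = max(prices)
        counts = [0] * len(prices)
        revenue = [0.0] * len(prices)
        total = 0.0
        for t in range(1, horizon + 1):
            if t <= len(prices):
                i = t - 1                      # play each price once to initialize
            else:
                i = max(
                    range(len(prices)),
                    key=lambda j: revenue[j] / counts[j]
                    + p_max * math.sqrt(2 * math.log(t) / counts[j]),
                )
            sale = rng.random() < buy_prob[i]  # simulated buyer demand (assumed Bernoulli)
            counts[i] += 1
            revenue[i] += prices[i] * sale
            total += prices[i] * sale
        return total, counts

    # Hypothetical price grid and demand curve.
    prices = [5, 10, 15, 20]
    buy_prob = [0.9, 0.6, 0.35, 0.2]
    print(ucb_pricing(prices, buy_prob))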
Nonstochastic Multi-Armed Bandit Approach to Stochastic Discrete Optimization
We present a sampling-based algorithm for solving stochastic discrete optimization problems based on Auer et al.’s Exp3 algorithm for “nonstochastic multi-armed bandit problems.” The algorithm solves the sample average approximation (SAA) of the original problem by iteratively updating and sampling from a probability distribution over the search space. We show that as the number of samples goes...
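A minimal sketch of the Exp3 update in its standard form from Auer et al. (the sample-average-approximation wrapper and the problem-specific sampling are not shown; the candidate-solution count and noisy objective below are assumptions):

    import math
    import random

    def exp3(reward_fn, num_arms, horizon=1000, gamma=0.1, seed=0):
        """Exp3: maintain weights, sample an arm from the mixed distribution, and
        update the chosen arm with an importance-weighted reward in [0, 1]."""
        rng = random.Random(seed)
        weights = [1.0] * num_arms
        for _ in range(horizon):
            total = sum(weights)
            probs = [(1 - gamma) * w / total + gamma / num_arms for w in weights]
            arm = rng.choices(range(num_arms), weights=probs)[0]
            reward = reward_fn(arm)            # noisy sample of the objective, assumed in [0, 1]
            estimate = reward / probs[arm]     # importance-weighted reward estimate
            weights[arm] *= math.exp(gamma * estimate / num_arms)
        return max(range(num_arms), key=lambda i: weights[i])

    # Hypothetical noisy objective over 5 candidate solutions.
    means = [0.2, 0.5, 0.7, 0.4, 0.3]
    best = exp3(lambda i: min(1.0, max(0.0, random.gauss(means[i], 0.1))), num_arms=5)
    print("recommended solution:", best)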
Combinatorial Multi-Armed Bandit with General Reward Functions
In this paper, we study the stochastic combinatorial multi-armed bandit (CMAB) framework that allows a general nonlinear reward function, whose expected value may not depend only on the means of the input random variables but possibly on the entire distributions of these variables. Our framework enables a much larger class of reward functions such as the max() function and nonlinear utility fun...
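A small numerical illustration (values assumed, not taken from the paper) of why a max()-type reward depends on the full arm distributions rather than only their means: two arms with the same mean, paired with the same third arm, yield different expected max() rewards.

    from itertools import product

    def expected_max(dists):
        """Expected max of independent discrete random variables,
        each given as a {value: probability} dict."""
        total = 0.0
        for outcome in product(*(d.items() for d in dists)):
            prob = 1.0
            for _, p in outcome:
                prob *= p
            total += prob * max(v for v, _ in outcome)
        return total

    arm_a = {0.6: 1.0}               # constant 0.6
    arm_b = {0.0: 0.5, 1.0: 0.5}     # Bernoulli(0.5), mean 0.5
    arm_c = {0.5: 1.0}               # constant 0.5, same mean as arm_b

    # Same arm means, different expected max() rewards: 0.8 vs 0.6.
    print(expected_max([arm_a, arm_b]))
    print(expected_max([arm_a, arm_c]))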
Journal
Journal title: IEEE Sensors Journal
Year: 2020
ISSN: 1530-437X,1558-1748,2379-9153
DOI: 10.1109/jsen.2019.2941012